Thesaurus as a complex network

Authors

  • Adriano de Jesus Holanda
  • Ivan Torres Pisa
  • Osame Kinouchi
  • Alexandre Souto Martinez
  • Evandro Eduardo Seron Ruiz
Abstract

A thesaurus is one of many possible representations of term (or word) connectivity. The terms of a thesaurus are seen as the nodes, and their relationships as the links, of a directed graph. The directionality of the links retains all the thesaurus information and allows the measurement of several quantities. This has led to a new term classification according to the characteristics of the nodes, for example, nodes with no incoming links, nodes with no outgoing links, etc. Using an electronically available thesaurus, we have obtained the incoming and outgoing link distributions. While the incoming link distribution follows a stretched exponential function, the lower bound for the outgoing link distribution has the same envelope as the scientific paper citation distribution proposed by Albuquerque and Tsallis [1]. However, a better fit is obtained by a simpler function, which is the solution of Riccati’s differential equation. We conjecture that this differential equation is the continuous limit of a stochastic growth model of the thesaurus network. We also propose a new manner to arrange a thesaurus using the “inversion method”.
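
The directed-graph view described in the abstract can be illustrated with a minimal Python sketch. It assumes the thesaurus is available as a mapping from each root term to its list of related terms; the toy data, the function names (build_degrees, classify_terms) and the class labels ("no_in", "no_out", "isolated", "both") are hypothetical and are not taken from the paper.

```python
from collections import Counter


def build_degrees(thesaurus):
    """Count incoming and outgoing links for every term.

    `thesaurus` maps a root term to an iterable of related terms;
    each (root -> related term) pair is one directed link.
    """
    in_degree, out_degree = {}, {}
    for root, related in thesaurus.items():
        targets = set(related) - {root}  # drop duplicates and self-links
        out_degree[root] = len(targets)
        in_degree.setdefault(root, 0)
        for term in targets:
            in_degree[term] = in_degree.get(term, 0) + 1
            out_degree.setdefault(term, 0)
    return in_degree, out_degree


def classify_terms(in_degree, out_degree):
    """Label each term by whether it has links in and/or links out.

    The labels are illustrative names, not the paper's terminology.
    """
    classes = {}
    for term in in_degree:
        k_in, k_out = in_degree[term], out_degree[term]
        if k_in == 0 and k_out == 0:
            classes[term] = "isolated"
        elif k_in == 0:
            classes[term] = "no_in"
        elif k_out == 0:
            classes[term] = "no_out"
        else:
            classes[term] = "both"
    return classes


# Hypothetical toy thesaurus, for illustration only.
toy = {
    "big": ["large", "huge"],
    "large": ["big", "huge"],
    "tiny": ["small"],
    "alone": [],
}

in_deg, out_deg = build_degrees(toy)
classes = classify_terms(in_deg, out_deg)

# Empirical link distributions: number of terms having each degree value.
in_dist = Counter(in_deg.values())
out_dist = Counter(out_deg.values())
```

On a real thesaurus, in_dist and out_dist would be the histograms to which candidate functions (such as a stretched exponential) are fitted; the fitting step itself is not sketched here.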


Similar resources

Feasibility study of a plan to compile a thesaurus of women and family studies based on the BS ISO 25964-1 standard

Research objective: to study the feasibility of a Family and Women’s Studies Thesaurus. Considering the expansion of information in the field of women and family studies, the wide span of related vocabulary, and the development of vocabulary lists and bibliographies, the Family and Women’s Studies Thesaurus can be a professional tool for indexing and retrieval of women’s information in data...


A new method for automatic indexing and keyword extraction for information retrieval and text clustering

Persian words take diverse written forms and cover all grammatical modes through a series of specific rules, which makes automatic keyword extraction from Persian texts difficult and complex. This thesis attempts to use linguistic information and a thesaurus so that keywords can be provided. Using the symbol system is structured network can be keywords, i...


Lobby index as a network centrality measure

We study the lobby index (l for short) as a local node centrality measure for complex networks. The l is compared with degree (a local measure), betweenness and eigenvector centralities (two global measures) in the case of a biological network (yeast protein-protein interaction network) and a linguistic network (Moby Thesaurus II). In both networks, the l has poor correlation with betweenness...


A survey of the state of Persian thesaurus management and presentation software

The current study investigates software for managing and providing Persian thesauri. Using a descriptive survey method, we analyzed five thesaurus management software packages, namely “Islamic Sciences Thesaurus”, “Thesaurus Builder”, “Pars Azarakhsh”, “Ghamoos” and “published version of Ebrahimpoor Thesaurus”, along with four software packages for providing thesau...


Basic word statistics for information retrieval: thesaurus as a complex network

Words are the building blocks used to construct sentences and to transmit information. Here, two distinctive hard classification approaches are applied to words. First, we consider words as being the nodes and their relationships as being the links of a directed graph. This permits us to define, in a natural manner, the thesaurus conformation. The statistics of the outgoing and incoming links are char...


Publication date: 2003